Improved Deterministic Length Reduction 






Amihood Amir, Bar-Ilan University and Johns Hopkins University
Klim Efremenko, Bar-Ilan University
Oren Kapah, Bar-Ilan University
Ely Porat, Bar-Ilan University
Amir Rothschild, Bar-Ilan University






Abstract 

This paper presents a new technique for deterministic length reduction. This technique
improves the running time of the algorithm presented in [?] for performing fast convolution
in sparse data. While the regular fast convolution of vectors $V_1, V_2$, whose sizes are $N_1, N_2$
respectively, takes $O(N_1 \log N_2)$ time using FFT, the algorithm proposed in [?] uses length
reduction to perform the convolution in $O(n_1 \log^3 n_1)$ time, where $n_1$ is the number
of non-zero values in $V_1$. The algorithm assumes that $V_1$ is given in advance, and $V_2$ is given
at running time. The novel technique presented in this paper improves the convolution time
to $O(n_1 \log^2 n_1)$ deterministically, which equals the best running time achieved by a
randomized algorithm.

The preprocessing time of the new technique remains the same as the preprocessing time
of [?], which is $O(n_1^2)$. This holds for the case where $N_1$ is polynomial in $n_1$. In the
case where $N_1$ is exponential in $n_1$, a reduction to the polynomial case can be used. In this paper
we also improve the preprocessing time of this reduction from $O(n_1^4)$ to $O(n_1^3\,\mathrm{polylog}(n_1))$.

1 Introduction

The d-dimensional point set matching problem serves as a powerful tool in numerous application
domains. In the d-dimensional point set matching problem, two sets of points $T, P \subseteq \mathbb{N}^d$ consisting
of $n, m$ points, respectively, are given. The goal is to determine if there is a rigid transformation
under which all the points in $P$ are covered with points in $T$. Among the important application
domains to which this problem contributes are model based object recognition, image registration,
pharmacophore identification, and searching in music archives. For an explanation of the uses of
the point-set matching problem in these domains see [?].

The point-set matching problem has been studied in the literature in many variations, not the least
of which is in the algorithms literature. In [?] Cardoze and Schulman used a randomized algorithm
to reduce the space size of $T, P$ and then solve the problem in the reduced space. In [1]
Cole and Hariharan proposed a solution to the d-Dimensional Sparse Wildcard Matching problem. This
is a generalization of the d-dimensional point set matching problem where every point in $\mathbb{N}^d$ is
associated with a value. A match is declared if the values of coinciding points are equal. The
Cole and Hariharan solution consists of two steps. The first step is a Dimension Reduction, where
the inputs $T, P$ are linearized into raw vectors $T', P'$ of size polynomial in the number of non-zero
values. The second step is a Length Reduction, where each of the raw vectors $T', P'$ is replaced
by $\log n$ short vectors of size $O(n)$, where $n$ is the number of non-zeros. The idea is that the mapping
to the short vectors preserves the distances in the original vectors, thus the problem is reduced to a
matching problem of short vectors, for which efficient solutions exist. The problem with the length
reduction idea is that more than one point can be mapped into the same location, thus it is no
longer clear whether there is indeed a match in the original vectors. The proposed solution of Cole
and Hariharan was to create a set of $\log n$ pairs of vectors using $\log n$ hash functions rather than a
single pair of vectors. Their scheme reduced the failure probability.

In [?], the first deterministic algorithm for finding $\log n$ hash functions that reduce the size
of the vectors to $O(n \log n)$ was presented. The algorithm guaranteed that each non-zero value
appears with no collisions in at least one of the vectors, thus eliminating the possibility of an error.
The length reduction idea was used to solve the Sparse Convolution problem posed in [3], where
the aim is to find the convolution vector $W$ of two vectors $V_1, V_2$ whose sizes are $N_1, N_2$, with
$n_1, n_2$ non-zero elements respectively (where $n_1 > n_2$). It is assumed that the two vectors are
not given explicitly, rather they are given as a set of (index, value) pairs. Using the Fast Fourier
Transform (FFT) algorithm, the convolution can be calculated in running time $O(N_1 \log N_2)$ [2]. In
our context, though, the vectors $V_1, V_2$ are sparse. The aim of the algorithm is to compute $W$ in
time proportional to the number of non-zero entries in $W$, which may be significantly smaller than
$O(N_1)$. Clearly, this can be easily done in time $O(n_1 n_2)$.

The goal of the length reduction is as follows: Given two vectors $V_1, V_2$ whose sizes are $N_1, N_2$,
with $n_1, n_2$ non-zero elements respectively (where $n_1 > n_2$), obtain two vectors $V_1', V_2'$ of size $O(n_1)$
such that all the non-zeros in $V_1$ and in $V_2$ will appear as singletons in $V_1'$ and in $V_2'$ respectively
while maintaining the distance property.

The distance property which needs to be maintained is defined as follows: If $V_2'[f(0)]$ is aligned with
$V_1'[f(i)]$, then $V_2'[f(j)]$ will be aligned with $V_1'[f(i + j)]$.

This goal has not been reached yet. Rather, a set of $O(\log n_1)$ vectors of size $O(n_1 \log n_1)$ was obtained
in [?], where each non-zero in the text appears at least once as a singleton in the set of vectors.
This length reduction gave an $O(n_1 \log^3 n_1)$ algorithm for convolution in sparse data. In this paper
we go one step forward and reduce the size of the obtained vectors to $O(n_1)$. This length reduction
technique improves the running time of the fast convolution presented in [?] to $O(n_1 \log^2 n_1)$, which
is the running time of the randomized algorithm presented in [1].

2 Preliminaries and Notations



Throughout this paper, a capital letter (usually N) is used to denote the size of the vector, which is 
equivalent to the largest index of a non-zero value, and a small letter (usually n) is used to denote 
the number of non-zero values. It is assumed that the vectors are not given explicitly, rather they 
are given as a set of (index, value) pairs, for all the non-zero values.

A convolution uses two initial functions, $V_1$ and $V_2$, to produce a third function $W$. We formally
define a discrete convolution.

Definition 1 Let $V_1$ be a function whose domain is $\{0, \ldots, N_1 - 1\}$ and $V_2$ a function whose domain
is $\{0, \ldots, N_2 - 1\}$. We may view $V_1$ and $V_2$ as arrays of numbers, whose lengths are $N_1$ and $N_2$,
respectively. The discrete convolution of $V_1$ and $V_2$ is the polynomial multiplication

$$W[j] = \sum_{i=0}^{N_2-1} V_1[j+i]\, V_2[i].$$

In the general case, the convolution can be computed by using the Fast Fourier Transform (FFT) [2].
This can be done in time $O(N_1 \log N_2)$, in a computational model with word size $O(\log N_2)$. In the
sparse case, many values of $V_1$ and $V_2$ are 0. Thus, they do not contribute to the convolution value.
In our convention, the number of non-zero values of $V_1$ ($V_2$) is $n_1$ ($n_2$). Clearly, we can compute the
convolution in time $O(n_1 n_2)$. The question posed by Muthukrishnan [3] is whether the convolution
can be computed in time $o(n_1 n_2)$.
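
As a point of reference, the quadratic baseline can be written directly over the (index, value) pairs. The following is a minimal Python sketch, not taken from the paper; the function name `sparse_convolution` and the dictionary representation of the sparse vectors are assumptions made for illustration.

```python
from collections import defaultdict

def sparse_convolution(v1, v2):
    """Naive O(n1 * n2) convolution W[j] = sum_i V1[j + i] * V2[i].

    v1, v2: dicts mapping index -> non-zero value (the (index, value) pairs).
    Returns a dict holding only the non-zero entries W[j] with j >= 0.
    """
    w = defaultdict(int)
    for k, a in v1.items():        # k plays the role of j + i
        for i, b in v2.items():
            j = k - i
            if j >= 0:             # W is only defined for non-negative j
                w[j] += a * b
    return {j: x for j, x in w.items() if x != 0}

v1 = {0: 1, 3: 2, 7: 5}
v2 = {0: 4, 3: 1}
print(sparse_convolution(v1, v2))
```

Every pair of non-zero entries is touched exactly once, which is where the $n_1 n_2$ factor comes from.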

Cole and Hariharan's suggestion was to use length reduction. Suppose we can map all the non-zero
values into a smaller vector, say of size $O(n_1 \log n_1)$. Suppose also that this mapping is
alignment preserving in the sense that applying the same transformation on $V_2$ will guarantee that
the alignments are preserved. Then we can simply map the vectors $V_1$ and $V_2$ into the smaller
vectors and then use FFT for the convolutions on the smaller vectors, achieving time $O(n_1 \log^2 n_1)$.

The problem is that to date there is no known mapping with that alignment preserving property.
Cole and Hariharan [1] suggested a randomized idea that answers the problem with high probability.
The reason their algorithm is not deterministic is the following: In their length reduction phase,
several indices of non-zero values in the original vector may be mapped into the same index in the
reduced size vector. If the index of only one non-zero value is mapped into an index in the reduced
size vector, then this index is denoted as singleton and the non-zero value is said to appear as a
singleton. If more than one non-zero value is mapped into the same index in the reduced size vector,
then this index is denoted as multiple. The multiple case is problematic since we cannot be sure of
the right alignment. Fortunately, Cole and Hariharan showed a method whereby in $O(\log n_1)$ tries,
the probability that some index will always be in a multiple situation is small. In [?], a deterministic
solution to the multiple problem was presented. That solution utilized number theoretic ideas. The
new idea of this paper is to improve the reduction size by using polynomials to represent the locations
of the non-zero elements of the given vectors.
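
The singleton and multiple terminology can be illustrated with a small Python sketch. It assumes, purely for illustration, the mapping `i -> i mod q`; this is not the polynomial mapping developed in this paper, and the function name `classify` is hypothetical.

```python
from collections import defaultdict

def classify(indices, q):
    """Split non-zero indices into singletons and multiples under i -> i mod q."""
    buckets = defaultdict(list)
    for i in indices:
        buckets[i % q].append(i)
    # A location holding exactly one index is a singleton.
    singletons = sorted(b[0] for b in buckets.values() if len(b) == 1)
    # Every index sharing its location with another index is a multiple.
    multiples = sorted(i for b in buckets.values() if len(b) > 1 for i in b)
    return singletons, multiples

print(classify([3, 10, 17, 22], 7))   # 3, 10 and 17 all land on location 3
print(classify([3, 10, 17, 22], 11))  # every index gets its own location
```

With a single mapping some indices may be multiples; the length reduction schemes discussed here use several mappings so that every index is a singleton in at least one of them.
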

3 The New Length Reduction Technique for the Polynomial Case



The proposed technique deals with the case where $N_1$ is polynomial in $n_1$, thus the indices are
bounded by $n_1^c$ for some constant $c$. In the case where $N_1$ is exponential in $n_1$, the reduction to
the polynomial case can be used.

The main idea of the algorithm is to derive a set of unique polynomials from each non-zero index in
$V_1$, and one polynomial for each non-zero in $V_2$. Each assignment for the polynomials in $\mathbb{F}_q$, where
$q$ is a prime number of size $O(n_1)$, will give a different mapping of the non-zeros in $V_1$ and in $V_2$ to
vectors of size $q$. The convolution will be performed between the vectors obtained from $V_1$ and $V_2$
under the same assignments.

The first step of the algorithm is to choose a prime number $q$ of size $O(n_1)$, and create a polynomial for
each non-zero index in $V_1$. The polynomial created for index $i$ will be denoted as the base polynomial
of $V_1[i]$. The polynomial is created by representing the index as a number in base $\frac{q-1}{2}$.
Each digit is interpreted as a coefficient of the polynomial. For example: if $q = 13$, then index 95
in base 10 is 235 in base $\frac{13-1}{2} = 6$, which is represented by the polynomial $2X^2 + 3X + 5$.
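
The digit extraction can be sketched in a few lines of Python. This is an illustrative sketch only: the names `base_polynomial` and `evaluate` are hypothetical, coefficients are stored lowest degree first, and $q \ge 5$ is assumed so that the base $\frac{q-1}{2}$ is at least 2.

```python
def base_polynomial(index, q):
    """Digits of `index` in base (q - 1) // 2, lowest degree first."""
    base = (q - 1) // 2
    coeffs = []
    while index > 0:
        index, digit = divmod(index, base)
        coeffs.append(digit)
    return coeffs or [0]

def evaluate(coeffs, x, q):
    """Value of the polynomial at X = x in F_q, using Horner's rule."""
    result = 0
    for coeff in reversed(coeffs):
        result = (result * x + coeff) % q
    return result

# Index 95 with q = 13 gives 2X^2 + 3X + 5, stored as [5, 3, 2].
print(base_polynomial(95, 13))
# The assignment X = 4 maps it to location 2*16 + 3*4 + 5 = 49 = 10 (mod 13).
print(evaluate([5, 3, 2], 4, 13))
```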

Since the indices in $V_1$ are bounded by $n_1^c$, and $q$ is $O(n_1)$, the degree of the polynomials
created in this step is bounded by $c$. In the next step, from each polynomial we create $2^c$
polynomials. This is done by giving two choices for each coefficient of the polynomial: (1) Leave it
as is. (2) Add $\frac{q-1}{2}$ to the coefficient and decrease by 1 the coefficient of the next higher degree. We do
this for all the coefficients of the polynomial except for the coefficient of the highest degree.

Example 1 Suppose we have a non-zero index 95. Using $q = 13$ we get the base polynomial
$2X^2 + 3X + 5$. After the second step we will obtain 4 polynomials: $2X^2 + 3X + 5$, $2X^2 + 2X + 11$,
$X^2 + 9X + 5$, $X^2 + 8X + 11$.

The first polynomial is the base polynomial. The second polynomial was obtained by adding 6 to 
the first coefficient and decreasing the second coefficient by one. The 3rd and the 4th polynomials 
were created by adding 6 to the second coefficient of the first and second polynomials respectively, 
and decreasing the third coefficient by one. 
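
The duplication step of Example 1 can be reproduced with a short Python sketch. The function name `variants` and the list representation (lowest degree first) are assumptions; the sketch does not reduce coefficients modulo $q$.

```python
from itertools import product

def variants(coeffs, q):
    """All 2^c alternatives of a base polynomial (coefficients lowest degree first).

    For every coefficient except the highest one there are two choices:
    keep it, or add (q - 1) // 2 to it and subtract 1 from the next higher one.
    """
    half = (q - 1) // 2
    c = len(coeffs) - 1
    result = []
    for choice in product((0, 1), repeat=c):
        p = list(coeffs)
        for k, borrow in enumerate(choice):
            if borrow:
                p[k] += half
                p[k + 1] -= 1
        result.append(p)
    return result

# Base polynomial 2X^2 + 3X + 5 of index 95 with q = 13.
for p in variants([5, 3, 2], 13):
    print(p)
```

Each of the four lists evaluates to 95 at $X = 6$ over the integers, so they are alternative digit representations of the same index.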

The duplication of the polynomials was made to meet the distance preserving requirement from 
the length reduction specified in the following Lemma: 

Lemma 1 For any assignment of $X$, if $V_2[0]$ is aligned with the base polynomial representing $V_1[i]$,
then $V_2[j]$ will be aligned with one of the polynomials representing $V_1[i + j]$.

Proof: Let $q$ be the chosen prime number. Index 0 in $V_2$ is represented by the polynomial 0, and
index $j$ in $V_2$ is represented by a polynomial $A = a_c X^c + a_{c-1} X^{c-1} + \ldots + a_0$. Index $i$ in $V_1$ is
represented by a polynomial of the form $B = b_c X^c + b_{c-1} X^{c-1} + \ldots + b_0$, and index $i + j$ in $V_1$ is
represented by a polynomial $D = d_c X^c + d_{c-1} X^{c-1} + \ldots + d_0$. Note that the coefficients $a_k$ and $b_k$
are smaller than $\frac{q-1}{2}$.

Clearly, if $V_2[0]$ is aligned with $V_1[i]$, then for any assignment of $X$, $V_2[j]$ will be aligned with the
polynomial $A + B = (a_c + b_c) X^c + (a_{c-1} + b_{c-1}) X^{c-1} + \ldots + (a_0 + b_0)$. Now let us look at the
first coefficient of $D$. Since $a_0$ and $b_0$ are smaller than $\frac{q-1}{2}$, there are only two cases: (1)
$(a_0 + b_0) < \frac{q-1}{2}$, thus $d_0 = a_0 + b_0$. (2) $(a_0 + b_0) \ge \frac{q-1}{2}$, thus $d_0 = a_0 + b_0 - \frac{q-1}{2}$, which is
covered by the polynomial where $\frac{q-1}{2}$ was added to the first coefficient.

In the latter case, one was added to the second coefficient of $D$, which is why we decrease the next coefficient
whenever we add $\frac{q-1}{2}$ to the current coefficient. The same cases exist for all the coefficients,
but a polynomial was created for each possible case ($2^c$ cases), thus one of the created polynomials
will be equal to the polynomial $A + B$. □
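
The carry argument in the proof can be checked by brute force for small parameters. The following Python sketch is an illustration, not part of the paper: the helper names are hypothetical, all polynomials are padded to $c + 1$ coefficients (lowest degree first), and only pairs with $i + j < \left(\frac{q-1}{2}\right)^{c+1}$ are tested so that $i + j$ still fits in $c + 1$ digits.

```python
from itertools import product

def digits(index, base, c):
    """The c + 1 digits of `index` in the given base, lowest degree first."""
    out = []
    for _ in range(c + 1):
        index, d = divmod(index, base)
        out.append(d)
    return out

def variants(coeffs, half):
    """Set of the 2^c alternative polynomials of one base polynomial."""
    result = set()
    for choice in product((0, 1), repeat=len(coeffs) - 1):
        p = list(coeffs)
        for k, borrow in enumerate(choice):
            if borrow:
                p[k] += half
                p[k + 1] -= 1
        result.add(tuple(p))
    return result

def lemma1_holds(q, c):
    """Check that A + B is one of the polynomials created for index i + j."""
    half = (q - 1) // 2
    limit = half ** (c + 1)
    for i in range(limit):
        for j in range(limit - i):
            a, b = digits(j, half, c), digits(i, half, c)
            total = tuple(x + y for x, y in zip(a, b))
            if total not in variants(digits(i + j, half, c), half):
                return False
    return True

print(lemma1_holds(13, 2))
```

Because the coefficient-wise sum matches one of the created polynomials exactly, the two agree under every assignment of $X$, which is the statement of the lemma.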

Note that all the $2^c \times n_1$ created polynomials are unique, and in $\mathbb{F}_q$. Assigning a value to the
polynomials in $\mathbb{F}_q$ will give a vector of size $q$.

Lemma 2 Any two polynomials can be mapped to the same location in at most c assignments. 

Proof: The difference between any two distinct polynomials is a non-zero polynomial whose degree is
bounded by $c$. If both polynomials give the same index under the
selected assignment, then the assigned value is a root of the difference polynomial. The degree of
this polynomial is bounded by $c$, thus it can have at most $c$ different roots in $\mathbb{F}_q$. □
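
The root bound can be observed directly by listing the assignments under which two polynomials collide. This Python sketch is illustrative; the function names are hypothetical and coefficients are stored lowest degree first.

```python
def evaluate(coeffs, x, q):
    """Value of the polynomial at X = x in F_q, using Horner's rule."""
    result = 0
    for coeff in reversed(coeffs):
        result = (result * x + coeff) % q
    return result

def colliding_assignments(p1, p2, q):
    """All x in F_q under which p1 and p2 are mapped to the same location."""
    return [x for x in range(q) if evaluate(p1, x, q) == evaluate(p2, x, q)]

# X^2 and the constant 1 collide exactly where X^2 - 1 has a root in F_13.
print(colliding_assignments([0, 0, 1], [1, 0, 0], 13))
```

For degree bound $c = 2$ the returned list never has more than two entries, whichever two distinct polynomials are compared.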

Since any polynomial can be mapped into the same location with at most $2^c \times n_1 - 1$ other
polynomials, and with each of them at most $c$ times due to Lemma 2, we get the following
Corollary:

Corollary 1 Any polynomial can appear as a multiple in not more than $c \times 2^c \times n_1$ vectors.

The last step of the length reduction algorithm is to find a set of $O(\log n_1)$ assignments which will
ensure that each polynomial will appear as a singleton at least once.

The selection of the $O(\log n_1)$ assignments is done as follows: Construct a table $A$ with $2^c \times n_1$
columns and $c \times 2^{c+1} \times n_1$ rows. Row $i$ corresponds to an assigned value $a_i$ and the corresponding
reduced length vector $V_{1,i}$. Column $j$ corresponds to a polynomial $P_j$. The value of $A_{i,j}$ is set to
1 if polynomial $P_j$ appears as a singleton in vector $V_{1,i}$. Due to Corollary 1 the number of zeros in
each column cannot exceed $c \times 2^c \times n_1$. Thus, in each column there are 1's in at least half of the
rows, which means that the table is at least half full. Since the table is at least half full there exists
a row in which there is a 1 in at least half of the columns. The assignment value which generated
this row is chosen, and all the columns where there was a 1 in the selected row are deleted from
the table.

Recursively, another assignment value is chosen and the table size is halved again, until all the
columns are deleted. Since at each step at least half of the columns are deleted, the number of
assignment values chosen cannot exceed $\log(2^c \times n_1) = c + \log n_1 = O(\log n_1)$.
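
The table-halving selection can be sketched in Python as a greedy cover. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses every value of $\mathbb{F}_q$ as a candidate row instead of exactly $c \times 2^{c+1} \times n_1$ rows, it picks the row with the most remaining 1's rather than any row that is at least half full, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

def digits(index, base, c):
    """The c + 1 digits of `index` in the given base, lowest degree first."""
    out = []
    for _ in range(c + 1):
        index, d = divmod(index, base)
        out.append(d)
    return out

def variants(coeffs, half):
    """The 2^c alternative polynomials of one base polynomial."""
    result = []
    for choice in product((0, 1), repeat=len(coeffs) - 1):
        p = list(coeffs)
        for k, borrow in enumerate(choice):
            if borrow:
                p[k] += half
                p[k + 1] -= 1
        result.append(tuple(p))
    return result

def evaluate(coeffs, x, q):
    result = 0
    for coeff in reversed(coeffs):
        result = (result * x + coeff) % q
    return result

def all_polynomials(indices, q, c):
    half = (q - 1) // 2
    return [p for i in indices for p in variants(digits(i, half, c), half)]

def singleton_rows(polys, q):
    """rows[x] = set of polynomial numbers that are singletons under X = x."""
    rows = {}
    for x in range(q):
        locations = [evaluate(p, x, q) for p in polys]
        occupancy = Counter(locations)
        rows[x] = {k for k, loc in enumerate(locations) if occupancy[loc] == 1}
    return rows

def choose_assignments(rows, num_polys):
    """Greedily pick rows until every column (polynomial) has been covered."""
    uncovered = set(range(num_polys))
    chosen = []
    while uncovered:
        best = max(rows, key=lambda x: len(rows[x] & uncovered))
        gained = rows[best] & uncovered
        if not gained:
            raise ValueError("some polynomial is never a singleton; q is too small")
        chosen.append(best)
        uncovered -= gained
    return chosen

polys = all_polynomials([3, 120, 777], 101, 1)
rows = singleton_rows(polys, 101)
print(choose_assignments(rows, len(polys)))
```

When the table is at least half full, as the text argues, the best row covers at least half of the remaining columns, so the loop ends after a logarithmic number of picks.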

Time: Creating vector $V_{1,i}$ (row $i$) takes $O(n_1)$ time. Since we start with a full matrix of $O(n_1)$
rows, the initialization takes $O(n_1^2)$ time. Choosing the $O(\log n_1)$ assignment values is done
recursively. The recurrence is:

4 The New Algorithm for The Exponential Case



In this case, as proposed in [?], each of the vectors $V_1$ and $V_2$ is reduced into a single vector of size
$O(n_1^4)$, where all the non-zeros appear as singletons. The reduction is performed using the modulus
function with a prime number $q$ of size $O(n_1^4)$. It was already proven there that there are at most
$n_1^3$ prime numbers of size $O(n_1^4)$ which generate at least one multiple. Thus, by testing $n_1^3 + 1$ prime
numbers we ensure that at least one of them produces a vector with no multiples.

In order to find such a prime number, we find $n_1^3 + 1$ prime numbers of size $O(n_1^4)$. Then we multiply all the
prime numbers to receive a large number $Q$. In addition we have at most $n_1^2$ different distances
between any two non-zeros. We multiply all of them to receive the large number $D$. The next
step is to find the greatest common divisor (GCD) of $Q$ and $D$. Since there is at least
one prime number in $Q$ which does not divide $D$, $\mathrm{GCD}(Q, D)$ is less than $Q$. Dividing $Q$
by $\mathrm{GCD}(Q, D)$ will give $P$, which is the product of all the prime numbers that create
only singletons. The last step is to find at least one of them. This is done using a binary search
on the prime numbers. We take the product $Q'$ of half of the prime numbers, and find
$\mathrm{GCD}(Q', P)$. If $\mathrm{GCD}(Q', P) > 1$ we continue with this set of prime numbers and multiply half of
them iteratively. Otherwise, we continue with the other half of the prime numbers. After $O(\log n_1)$
iterations we will find one prime number which will generate only singletons.

The algorithm appears in detail below. 

Algorithm: $N_1$ is exponential in $n_1$

1. Find $n_1^3 + 1$ prime numbers of size $O(n_1^4)$.

2. Multiply all the prime numbers to obtain $Q$.

3. Multiply all the differences between any two non-zero indices to obtain $D$.

4. $P = Q / \mathrm{GCD}(Q, D)$.

5. Let $S$ be the set of all the prime numbers found in step 1.

6. While the size of $S$ is larger than 1 do:

(a) Let $S'$ be the set of the first half of the prime numbers in $S$.

(b) Set $Q'$ to be the product of all the prime numbers in $S'$.

(c) If $\mathrm{GCD}(Q', P) > 1$ then set $S = S'$, otherwise set $S = S \setminus S'$.

end Algorithm
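
Steps 2 to 6 above can be sketched in Python with arbitrary-precision integers. This is a minimal illustration under stated assumptions: the candidate primes are passed in rather than generated (step 1), the products are computed with `math.prod` instead of a multiplication tree, and the function name `find_singleton_prime` is hypothetical.

```python
from itertools import combinations
from math import gcd, prod

def find_singleton_prime(indices, primes):
    """Return a prime from `primes` under which all indices have distinct residues.

    `primes` must be a list of distinct primes; `indices` are the non-zero indices.
    """
    Q = prod(primes)                                           # step 2
    D = prod(abs(a - b) for a, b in combinations(indices, 2))  # step 3
    P = Q // gcd(Q, D)                                         # step 4
    if P == 1:
        raise ValueError("every candidate prime divides some distance")
    S = list(primes)                                           # step 5
    while len(S) > 1:                                          # step 6
        first_half = S[:len(S) // 2]                           # (a)
        if gcd(prod(first_half), P) > 1:                       # (b), (c)
            S = first_half
        else:
            S = S[len(S) // 2:]
    return S[0]

indices = [3, 10, 17, 22]
q = find_singleton_prime(indices, [5, 7, 11, 13])
print(q, sorted(i % q for i in indices))
```

Since $Q$ is a product of distinct primes, $\mathrm{GCD}(Q, D)$ is exactly the product of the candidate primes that divide some distance, so $P$ collects the primes that create no multiples.
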
Correctness: Immediately follows from the discussion. 

Time: Step 1 is performed in time $O(n_1^3\,\mathrm{polylog}(n_1))$ using the primality testing described in [?].
Step 2 is done by building a binary tree of multiplications, where each node contains the product
of the two numbers in the level below it. This tree has $O(\log n_1)$ levels. In the leaves there are
$n_1^3$ prime numbers with $O(\log n_1)$ bits each, so the total number of bits in each level is $O(n_1^3 \log n_1)$. A
multiplication of two numbers can be computed in time $O(b \log b \log \log b)$ [4], where $b$ is the
number of bits. Thus each level can be computed in time $O(n_1^3\,\mathrm{polylog}(n_1))$ and the total time for
step 2 is $O(n_1^3\,\mathrm{polylog}(n_1))$. Step 3 is performed in the same way, but this time in the leaves there
are $n_1^2$ numbers with $O(n_1)$ bits each, thus each level has $O(n_1^3)$ bits and the time for this step is
$O(n_1^3\,\mathrm{polylog}(n_1))$. In step 4 we calculate the GCD of two numbers with $O(n_1^3 \log n_1)$ bits. This can
be calculated in time $O(n_1^3\,\mathrm{polylog}(n_1))$ using [?]. The calculation for step 6(b) was already performed
in step 2, and step 6(c) can be calculated in time $O(n_1^3\,\mathrm{polylog}(n_1))$, thus the time of step 6 is
$O(n_1^3\,\mathrm{polylog}(n_1))$. Following this discussion the total time of this algorithm is $O(n_1^3\,\mathrm{polylog}(n_1))$.
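
The binary tree of multiplications used in steps 2 and 3 can be sketched as follows. This Python sketch is illustrative: the function name `product_tree` is hypothetical, and it relies on Python's built-in big-integer multiplication rather than the multiplication algorithm of [4].

```python
def product_tree(values):
    """Levels of a binary multiplication tree, leaves first, root last."""
    levels = [list(values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Multiply neighbouring pairs; an unpaired last element is carried up.
        nxt = [prev[k] * prev[k + 1] for k in range(0, len(prev) - 1, 2)]
        if len(prev) % 2:
            nxt.append(prev[-1])
        levels.append(nxt)
    return levels

for level in product_tree([2, 3, 5, 7, 11]):
    print(level)
```

The root holds the full product, and the inner nodes hold products of contiguous blocks of leaves. When the number of primes is a power of two, the halves examined by the binary search of step 6 coincide with such blocks, which is one way to read the remark that step 6(b) was already computed in step 2.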

5 Conclusion and Open Problems 

Improved deterministic algorithms for Length Reduction and Sparse Convolution were presented
in this paper. These can be used as tools to provide faster algorithms for several well known
problems. The deterministic time achieved for convolving input patterns with a fixed text is the
same as that of the best known randomized algorithm.

An important problem remains: Can the Length Reduction and Sparse Convolution problems be 
solved in real time without the need of the preprocessing step, or alternately, can the preprocessing 
time be reduced from quadratic? 